Linguistic Processor Training on Speaker Data for Unit Selection Text-to-Speech

Author

  • Tetyana Lyudovyk
Abstract

This paper describes an approach to synthesizing personalized speech that preserves not only the speaker's voice but also the speaker's pronunciation peculiarities. Personalization is realized by means of pronunciation models trained on the speaker's data contained in his or her speech database. Untrained models allow speech to be synthesized in a neutral, normative style. On the segmental level, a transcription model is used. On the prosodic level, models for phrasing, intonation, pauses, and phoneme duration are used. These prosodic models are derived from a comparative acoustic-phonetic study of different speakers' data contained in several speech corpora and databases. The pronunciation models are personalized during off-line training of the linguistic processor, using the speech database annotation. In the on-line speech synthesis mode, the linguistic processor uses the personalized pronunciation models to generate a speaker-specific target specification for the input text.
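
The split between off-line training and on-line use described in the abstract can be illustrated, for the segmental level only, with a minimal sketch. It is not the paper's method: the class `TranscriptionModel`, its fields, the phone symbols, and the rule "store a speaker variant whenever it differs from the normative one" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TranscriptionModel:
    """Hypothetical segmental model: normative lexicon plus speaker overrides."""

    normative: Dict[str, List[str]]
    speaker_overrides: Dict[str, List[str]] = field(default_factory=dict)

    def train(self, annotation: List[Tuple[str, List[str]]]) -> None:
        """Off-line step: keep pronunciations that differ from the normative one.

        `annotation` stands in for the speech database annotation: pairs of
        (word, phones actually produced by the speaker). If a word occurs
        several times, the last observed variant wins (a simplification).
        """
        for word, phones in annotation:
            if self.normative.get(word) != phones:
                self.speaker_overrides[word] = phones

    def transcribe(self, word: str) -> List[str]:
        """On-line step: prefer the speaker's variant, else the normative one."""
        if word in self.speaker_overrides:
            return self.speaker_overrides[word]
        return self.normative.get(word, [])


# Illustrative data only; the phone symbols are invented for this example.
normative = {"either": ["iy", "dh", "er"], "data": ["d", "ey", "t", "ah"]}
model = TranscriptionModel(normative)

# Untrained model: neutral, normative transcription.
neutral = model.transcribe("either")

# After training on the (hypothetical) annotation: speaker-specific output.
model.train([("either", ["ay", "dh", "er"]), ("data", ["d", "ey", "t", "ah"])])
personal = model.transcribe("either")
```

In this sketch an untrained model simply falls back to the normative lexicon, which mirrors the abstract's statement that untrained models yield a neutral style. The prosodic models (phrasing, intonation, pauses, duration) would need their own representations and are not covered here.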

Similar resources

Unit Selection Speech Synthesis Using Phonetic-Prosodic Description of Speech Databases

This paper describes an approach to speech synthesis based on using speech databases at different stages of the TTS process. Speech database units are phones in different segmental and prosodic contexts. Pitch-synchronous segmentation and labeling of the databases allow both segmental and prosodic information to be stored. Phonetic-prosodic annotations of speech databases are involved in off-line training ...

Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis grants the computer the human ability to produce speech. The data-based and process-based approaches are the two main approaches to speech synthesis, and each has its own challenges. Unit-selection speech synthesis and statistical parametr...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Exploiting Alternatives for Text-To-Speech Synthesis: From Machine to Human

The absence of alternatives/variants is a dramatic limitation of text-to-speech synthesis compared to the variety of human speech. This paper introduces the use of speech alternatives/variants in order to improve text-to-speech synthesis systems. Speech alternatives denote the variety of possibilities that a speaker has to pronounce a sentence depending on linguistic constraints, specific stra...

Publication date: 2006